Log-normal distribution for correlators in lattice QCD?
Abstract

Many hadronic correlators used in spectroscopy calculations in lattice QCD simulations appear 
to show a log-normal distribution at intermediate time separations.

Recently, while performing numerical simulations of unitary fermion gases, the authors
of Refs. [1-3] discovered that spectroscopic correlation functions of operators separated by
a Euclidean time t, call them generically C(t), show a log-normal distribution. In a review
article they presented high-statistics plots of the distribution of propagator values of
a Lambda-Lambda dibaryon state 0, which also show a beautiful Gaussian structure for
log C(t). The width of the Gaussian increases roughly linearly with t. I have been doing
simulations of quenched baryon spectroscopy in large-N SU(N) gauge field backgrounds,
and I see the same thing, although with lower statistics: compare Fig. [1].

Unitary Fermi gases, Lambda-Lambda dibaryons, and large-N baryons are rather exotic 
objects for lattice study, and the question naturally arises, how common are log-normal 
distributions in lattice spectroscopy? I believe that they are ubiquitous. I observe them in 
the following data sets: 

• Meson and baryon spectroscopy, and string tension data from Wilson loops, in 
quenched SU(3), SU(5) and SU(7) simulations at a lattice spacing of about 0.1 fm 

• Nf = 2 flavor dynamical simulations at a similar lattice spacing 

• Quenched SU(3) simulations with the Wilson action at $\beta$ = 5.9 and 6.1 and overlap
valence fermions

• Simulations in the weak coupling phase of SU(3) gauge theory with two flavors of 
sextet-representation fermions 

This is a qualitative observation. I do not know why it occurs, how general it might be, 
nor what it is good for. In order not to make the paper too long, and to avoid being too 
redundant, I will only show pictures from quenched QCD. 

Let's set some definitions. With the nth moment of a set of random variables $x_i$ (i = 1
to N) as

$$\mu_n = \langle x^n \rangle , \tag{1}$$

the nth cumulant $\kappa_n$ is defined recursively as

$$\kappa_n = \mu_n - \sum_{m=1}^{n-1} \binom{n-1}{m-1} \kappa_m \mu_{n-m} . \tag{2}$$
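The recursion for the cumulants is easy to mis-index, so a minimal sketch may help. It assumes the samples sit in a NumPy array; the function name `cumulants_from_samples` and the argument `nmax` are illustrative choices and are not taken from the analysis code behind this paper.

```python
from math import comb

import numpy as np


def cumulants_from_samples(x, nmax=4):
    """Return [kappa_1, ..., kappa_nmax] built from raw moments.

    Uses kappa_n = mu_n - sum_{m=1}^{n-1} C(n-1, m-1) kappa_m mu_{n-m},
    with mu_n the raw moment <x^n>.
    """
    x = np.asarray(x, dtype=float)
    # mu[n] is the nth raw moment; mu[0] = 1 by convention.
    mu = [1.0] + [np.mean(x ** n) for n in range(1, nmax + 1)]
    kappa = [0.0] * (nmax + 1)  # kappa[0] is unused
    for n in range(1, nmax + 1):
        correction = sum(
            comb(n - 1, m - 1) * kappa[m] * mu[n - m] for m in range(1, n)
        )
        kappa[n] = mu[n] - correction
    return kappa[1:]
```

For n = 2 the recursion reduces to the familiar variance, $\kappa_2 = \mu_2 - \mu_1^2$, which is a quick sanity check on any implementation.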



The objects of our attention are some set of expectation values of correlation functions of
pairs of operators O

$$C(t) = \sum_x O_l(x,t) O_m(0,0) \tag{3}$$

generated in a Monte Carlo simulation, a set of random variables, $C_i(t)$ for the ith
measurement. Their falloff with t gives mass values.

If the operators are built of fermion propagators (such as for a meson or baryon
propagator), lattice symmetries (charge conjugation plus $\gamma_5$ Hermiticity) tell us that the real part
of C(t) carries the signal. In an infinite ensemble the imaginary part of C(t) would average
to zero. So I will only consider sets of real variables $C_i(t)$. The x's will be the logarithms of
C(t). At small and intermediate t all the C(t)'s in any data set have the same sign (positive,
by definition). At the largest times, some of the C(t)'s in some correlation functions can
fluctuate negative. When I calculate the cumulants of log C(t), I will simply discard these
wrong-sign entries from my analysis, and when I show a result I will report the number of
discarded configurations.
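The discarding step can be sketched in a few lines. This is only an illustration of the bookkeeping described above, assuming the measurements at one fixed t are stored in a one-dimensional array; the name `log_correlators` is hypothetical.

```python
import numpy as np


def log_correlators(c_t):
    """Take logs of the positive correlator values at one time slice.

    Returns the array of log C_i(t) for the kept configurations and the
    number of wrong-sign (non-positive) entries that were discarded.
    """
    c_t = np.asarray(c_t, dtype=float)
    keep = c_t > 0.0
    n_discarded = int(np.count_nonzero(~keep))
    return np.log(c_t[keep]), n_discarded
```

Returning the discard count alongside the logarithms makes it straightforward to report that number with every result.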

I consider mesonic and baryonic correlation functions, and Wilson loops. The correlators
I show which involve quark propagators use clover fermions with links smeared using
normalized hypercubic (nHYP) smearing [9, 10]. The clover coefficient is set to unity. Most of
my spectroscopic sets use an extended source (typically, a Gaussian product state is used as
the source of the fermion propagator) and point sinks, projected onto zero three-momentum.
Some data sets use zero-momentum Gaussian sinks as well. The first class of correlators
is not variational. The size of the source has typically been tuned to produce flat plateaus
in effective mass plots. These are completely standard data sets for lattice simulations,
although the largest lattice volumes are small by today's standards, $16^3 \times 48$ sites.

The other set of correlators I consider are Wilson loops, used to compute the heavy quark 
potential. These are real quantities and should all have the same sign. Fluctuations can 
drive them negative and I will treat this situation as I do for mesonic or baryonic correlators. 
The loops come from lattice configurations which were nHYP smeared and gauge fixed to 
axial gauge. 

I observe that over a wide range in t the second cumulant $\kappa_2$ is much greater than the
higher (n > 2) cumulants. If a distribution is Gaussian, its first and second cumulants (the
mean and variance) are the only non-vanishing ones, so this ordering of cumulants
means that the distributions of C(t) are approximately log-normal. I also observe the same
ordering of cumulant sizes for the nth moment of the correlator, $M_n$, defined as a power of the original
correlation function

$$M_n(t) = (C(t))^n . \tag{4}$$

Moments (powers) of a log-normally distributed variable are also log-normal. Finally, I observe, like
Refs. [1-3], that the second cumulant of log C(t), $\kappa_2$, increases roughly linearly with time t.
Log-normal behavior is most prominent at short and intermediate distances, but these are
distances where effective mass plots are roughly constant, where one would take masses to
publish as results.

Let us look at some examples. I begin with a data set of 80 quenched SU(3)
lattices of size $16^3 \times 32$ at $\beta$ = 6.0175. Hadronic correlators from this data set, at one
value of the hopping parameter $\kappa$ corresponding to an
Axial Ward Identity (AWI) quark mass in lattice units of about $a m_q$ = 0.055, are shown in
Fig. [2]. The errors on the $\kappa_n$'s come from a jackknife. Observe that $\kappa_2$ is much larger than
the other $\kappa_n$'s and increases linearly with t.
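For readers who want to reproduce error bars of this kind, here is a minimal sketch of a single-elimination jackknife. The text does not say which jackknife variant or block size was used, so the delete-one choice and the names `jackknife` and `estimator` are assumptions made for illustration.

```python
import numpy as np


def jackknife(samples, estimator):
    """Delete-one jackknife estimate and error for a scalar estimator.

    samples:   1D array, one entry per configuration (e.g. log C_i(t)).
    estimator: callable mapping a 1D array to a float (e.g. np.var for
               the second cumulant).
    """
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    full = estimator(samples)
    # Re-evaluate the estimator with each configuration removed in turn.
    reps = np.array([estimator(np.delete(samples, i)) for i in range(n)])
    err = np.sqrt((n - 1) / n * np.sum((reps - reps.mean()) ** 2))
    return full, err
```

Calling `jackknife(log_c, np.var)` would give the second cumulant of log C(t) at one t together with its jackknife error; higher cumulants follow by passing a different estimator.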

Fig. [3] shows a set of cumulants from moments. The higher cumulants become quite noisy.
The second cumulant of the square of the pseudoscalar propagator, of the square of the delta
propagator, and of the cube of the delta propagator all increase with t, and they either dominate
the other cumulants or remain the only cumulant with a statistically significant signal.

Fig. [4] shows plots of $\kappa_n$ vs t for several Wilson loops of temporal extent t from this
data set. Distances of t = 4-6 are the range from which potentials may begin to be
reliably extracted. Cumulants for the second and third moments of the (1, 1, 1) loop (panel
(b) of Fig. [4]) are shown in Fig. [5]. Evidently, these Wilson loop expectation values are also
log-normally distributed.

Log-normal distributions present a contradiction with a well-known expectation for the
noise in correlators, which goes back to Lepage [11]. The (over)simplified version of the
explanation is that while the signal $C_H(t)$ decays as $\exp(-m_H t)$, the noise in the channel
involves the exponential decay of the absolute square of the correlator

$$\sigma^2(t) \sim |C(t)|^2 = \langle 0 |\, |O(t)|^2 |O(0)|^2 \,| 0 \rangle \sim \exp(-m_2 t) \tag{5}$$

where $m_2$ is the mass of the lightest state which can be created by the squared operator. For the
pseudoscalar or the rho, the lightest state is the two-pseudoscalar state and $\sigma(t)/C(t)$ should
be roughly a constant for the pseudoscalar, and roughly increasing exponentially as
$\exp((m_\rho - m_\pi) t)$ for the rho. (The energy of two-particle states in a box includes an interaction
term [12], which will reappear below.) For the (N color) baryon correlator, two different
classes of behavior are expected for the moments [13]: when the moment number n is even,
the correlator should couple to nN/2 pseudoscalars and when n is odd, the lightest state
will be a single baryon plus (n - 1)N/2 pseudoscalars. Sometimes the squared correlator
can couple to the vacuum, in which case $\sigma^2(t)$ would be a constant. This is the situation for
the scalar glueball mass or any Wilson loop.

FIG. 1: Histogram of values of log C(t) for the propagator of a J = 7/2 baryon in SU(7). Panels
(a), (b), and (c) show results for t = 4, 6, and 8 respectively.

Now consider the situation for a log-normal correlator. With $\kappa_1$ and $\kappa_2$ the first two
cumulants of log C, the average value of the nth moment is

$$\langle C^n \rangle = \exp\left( n \kappa_1 + \frac{n^2}{2} \kappa_2 \right) . \tag{6}$$

FIG. 2: Cumulants of log C(t) for various smeared-to-point hadronic correlators of temporal extent
t, from quenched SU(3) simulations at $\beta$ = 6.0175, $\kappa$ = 0.125. (a) pseudoscalar, (b) vector, (c)
proton, (d) Delta. Labels are octagons for $\kappa_2$, squares for $\kappa_3$, diamonds for $\kappa_4$, crosses for $\kappa_5$. All
correlators are positive at all t apart from one Delta correlator at t = 10.
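The moment formula for a log-normal variable can be checked numerically. The sketch below assumes log C is exactly Gaussian with mean $\kappa_1$ and variance $\kappa_2$, and evaluates the average of $C^n$ by Gauss-Hermite quadrature using NumPy's `hermegauss`; the function names and the test values of $\kappa_1$, $\kappa_2$ are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def lognormal_moment_quadrature(n, kappa1, kappa2, deg=40):
    """<C^n> for log C ~ Normal(kappa1, kappa2), by Gauss-Hermite quadrature."""
    # Nodes and weights for the weight function exp(-z^2 / 2).
    z, w = hermegauss(deg)
    integrand = np.exp(n * (kappa1 + np.sqrt(kappa2) * z))
    return np.sum(w * integrand) / np.sqrt(2.0 * np.pi)


def lognormal_moment_formula(n, kappa1, kappa2):
    """Closed form: exp(n kappa1 + n^2 kappa2 / 2)."""
    return np.exp(n * kappa1 + 0.5 * n ** 2 * kappa2)
```

The $n^2$ in the exponent is what later produces the n(n - 1) dependence of the mass shift: the term linear in n is absorbed into n times the single-correlator mass.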



The correlators C(t) decrease with t proportional to exp(-Mt). This says that both $\kappa_1$ and
$\kappa_2$ should be linear functions of t, which is what we have seen. Call

$$\kappa_2(t) = t S + S_0 . \tag{7}$$

We can define an effective mass from the logarithm of the ratio,

$$M = -\log \frac{\langle C(t+1) \rangle}{\langle C(t) \rangle} . \tag{8}$$

Eq. [6] tells us that the mass associated with the nth moment is

$$M_n = n M_1 - \frac{n(n-1)}{2} S ; \tag{9}$$

that is, the log-normal distribution implies a pairwise interaction of the constituents of the
nth moment. This is clearly inconsistent with Lepage-like behavior.

FIG. 3: Cumulants of log $M_n(t)$ for moments of various smeared-to-point hadronic correlators of
temporal extent t, from quenched SU(3) simulations at $\beta$ = 6.0175, $\kappa$ = 0.125. (a) square of the
pseudoscalar correlator, (b) square of the Delta, (c) cube of the Delta. Labels are octagons for $\kappa_2$,
squares for $\kappa_3$, diamonds for $\kappa_4$, crosses for $\kappa_5$ (baryons only; these are very noisy for the mesons).
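The step from the log-normal moment formula to the mass of the nth moment can be spelled out. The derivation below is a sketch that assumes $\kappa_1(t) = -E t + c$ exactly, where $E$ and $c$ are symbols introduced here for illustration only, together with $\kappa_2(t) = t S + S_0$.

```latex
\begin{align*}
\log \langle C^n(t) \rangle
  &= n \kappa_1(t) + \frac{n^2}{2} \kappa_2(t)
   = n(-E t + c) + \frac{n^2}{2}(t S + S_0), \\
M_n &= -\left[ \log \langle C^n(t+1) \rangle - \log \langle C^n(t) \rangle \right]
   = n E - \frac{n^2}{2} S, \\
M_1 &= E - \frac{1}{2} S
   \quad\Longrightarrow\quad E = M_1 + \frac{1}{2} S, \\
M_n &= n M_1 + \frac{n}{2} S - \frac{n^2}{2} S
   = n M_1 - \frac{n(n-1)}{2} S .
\end{align*}
```

The factor n(n - 1)/2 counts pairs among n constituents, which is why the result reads as a pairwise interaction.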

We can compare the two expectations for correlators. With the correlators in hand, just
construct the correlation functions by averaging powers of the C(t)'s, the nth moments
$M_n(t)$, and directly measure the effective mass of $M_n(t)$.

In Fig. [6] I compare the highest-spin baryon in SU(N), N = 3, 5, 7. The bare couplings
have been tuned to match the lattice spacings. Panel (a) shows moments of the SU(3) delta.
The lightest state is just the delta itself: its mass (in lattice units) is about 0.8. At its $\kappa$
value the lattice pseudoscalar mass is 0.35, so the second moment (the octagons) should
asymptote to a mass of 3 x 0.35 = 1.05. Instead, it sits at roughly twice the delta's mass.
The third moment does not show a mass which is the sum of the delta mass plus three
times the pseudoscalar mass, 1.85; it sits at roughly three times the delta mass.

FIG. 4: Cumulants of log C(t) for various Wilson loops of temporal extent t, from quenched SU(3)
simulations at $\beta$ = 6.0175. (a) r = 2 planar loop; (b) r = (1, 1, 1) loop; (c) r = $3\sqrt{2}$ loop; (d)
r = $6\sqrt{2}$ loop. Labels are octagons for $\kappa_2$, squares for $\kappa_3$, diamonds for $\kappa_4$, crosses for $\kappa_5$. All
correlators are positive at all t.

For SU(5) (panel (b)), the situation is similar. Again the lattice pseudoscalar mass is
0.35. The baryon mass is about 1.5 in lattice units. The nth moment's effective mass is 
roughly just n times the baryon mass over a wide t range. At large t the masses tail over 
toward the Lepage formula. This is a soft statement, because the quality of the fit has 
deteriorated and it may be that the signal is just overwhelmed by noise, but it is certainly 
plausible. Note that this behavior occurs at much larger t than where the baryon's effective 
mass has gone to a plateau. 

The situation for SU(7) (panel (c)) is again similar. (Here, the pseudoscalar mass is 0.4 in
lattice units.) Apparently Eq. [5] is only an asymptotic result. This is no surprise: the simple
story was too simple. The correlator couples to everything with its quantum numbers, not
just the lightest state:

$$\sum_j \exp(-m_j t) \tag{10}$$

where the $m_j$ can include the n-baryon state. Presumably this is a dominant state, since some
attempt was made to optimize the operators to produce a single baryon state in C(t). So
the asymptotic form may appear only at very late time.

FIG. 5: Cumulants of log $M_n(t)$ from quenched SU(3) simulations at $\beta$ = 6.0175, for the (a)
second and (b) third moments of the r = (1, 1, 1) Wilson loop. Labels are octagons for $\kappa_2$, squares
for $\kappa_3$, diamonds for $\kappa_4$, crosses for $\kappa_5$.

Let's next test Eq. [9]. I just take effective masses and, under a jackknife, compute
$\Delta M_n = n M_1 - M_n$. This should be linear in n(n - 1), and the slope should be given by the part
of $\kappa_2$ for log C(t) which is linear in t. Fig. [7] shows this behavior quite nicely for hadron
correlators in SU(3). The line is a fit to S (see Eq. [7]) over the range 3 < t < 8.
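The measurement itself is simple to sketch. The code below assumes the correlators are stored as an array of shape (configurations, time slices) with all entries positive; the function names `effective_mass` and `delta_m` are hypothetical, and the jackknife wrapper that would supply the errors is omitted.

```python
import numpy as np


def effective_mass(avg_corr):
    """-log(<C(t+1)> / <C(t)>) from an ensemble-averaged correlator."""
    avg_corr = np.asarray(avg_corr, dtype=float)
    return -np.log(avg_corr[1:] / avg_corr[:-1])


def delta_m(c, n):
    """n * M_1(t) - M_n(t) for an array c of shape (n_configs, n_t).

    The nth moment is built by raising each configuration's correlator
    to the nth power and then averaging over configurations.
    """
    c = np.asarray(c, dtype=float)
    m1 = effective_mass(c.mean(axis=0))
    mn = effective_mass((c ** n).mean(axis=0))
    return n * m1 - mn
```

For log-normal data one would expect `delta_m(c, n)` to track n(n - 1)S/2 over the t range where $\kappa_2$ is linear in t, and to vanish if there were no configuration-to-configuration fluctuations at all.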

Recall panel (b) of Fig. [6], showing the evolution of mass parameters at large t. Fig. [8]
shows cumulants and the mass splitting for our SU(5) J = 5/2 state. Log-normal behavior
works well at shorter t and fails at the largest t.

Finally, we return to potentials. Figs. [4], [9] and [10] show the consistency of log-normal
behavior (dominant $\kappa_2$, effective masses scaling as in Eq. [9]) at short distances, and when
the effective mass for the moments falls, dominance of $\kappa_2$ goes away.

So to summarize: At small and intermediate t, hadronic correlators show log-normal 
behavior. This is the t range where the Lepage formula does not describe the effective mass 
of the moments of C(t). At large t, the Lepage formula does appear to describe the effective 
mass of the moments and correlators cease to be log-normal. 

As a last observation, we can ask about volume dependence. I have two volumes for some
of my quenched data sets. Figs. [11] and [12] show that the S parameter, the slope of $\kappa_2$ with t,
often scales inversely with the simulation volume. I do not have enough other data sets to
say more about this.

Are there any consequences of this observation? I can think of two. 

First, the authors of Refs. [1-3] have shown that, for their data sets, noisy signals can
be tamed by replacing the average correlator by a truncated cumulant sum,

$$\log \langle C(t) \rangle \simeq \sum_{n=1}^{N} \frac{\kappa_n(t)}{n!} . \tag{11}$$

Truncation of the sum at some finite N introduces a systematic error on the mass, but it
might be less than the statistical error associated with averaging the original data set. This
might give a prediction for M with a small statistical error. Varying N and refitting would
allow an estimation of the systematic error. Because the cumulants involve all the data,
it would be necessary to fold this procedure into a jackknife or bootstrap, and take the
uncertainty in the fit parameters from the jackknife or bootstrap average.

FIG. 6: Effective mass for moments of the highest-spin baryon (higher moments lie higher, so the
baryon effective mass is given by crosses, the effective mass of the squared correlator is given by
octagons, for the cubed correlator, by squares, and the fancy diamonds are for $M_4$): (a) SU(3),
$\kappa$ = 0.125; (b) SU(5), $\kappa$ = 0.1265; (c) SU(7), $\kappa$ = 0.128.

FIG. 7: Effective mass differences $\Delta M_n = n M_1 - M_n$ for hadronic correlators in quenched SU(3),
$\beta$ = 6.0175, $\kappa$ = 0.125. Symbols are octagons for t = 4, squares for t = 6, diamonds for t = 8 and
fancy diamonds for t = 10. The line is a fit to the slope of $\kappa_2$ for 3 < t < 8. (a) pseudoscalar; (b)
vector meson; (c) proton; (d) delta.
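A minimal sketch of folding a truncated cumulant sum into a jackknife follows. It assumes the standard cumulant expansion, log of the average of C equals the sum over n of $\kappa_n/n!$ with $\kappa_n$ the cumulants of log C, truncated at `order`, and a delete-one jackknife over configurations. All function names are hypothetical, and this is not the procedure of the cited references in detail.

```python
from math import comb, factorial

import numpy as np


def cumulants(x, nmax):
    """Cumulants kappa_1..kappa_nmax of the samples x via raw moments."""
    mu = [1.0] + [np.mean(x ** n) for n in range(1, nmax + 1)]
    kappa = [0.0] * (nmax + 1)
    for n in range(1, nmax + 1):
        kappa[n] = mu[n] - sum(
            comb(n - 1, m - 1) * kappa[m] * mu[n - m] for m in range(1, n)
        )
    return kappa[1:]


def truncated_log_mean(c_t, order):
    """Approximate log <C(t)> by sum_{n=1}^{order} kappa_n / n!."""
    k = cumulants(np.log(c_t), order)
    return sum(k[n - 1] / factorial(n) for n in range(1, order + 1))


def truncated_effective_mass(c, t, order):
    """Effective mass at t from the truncated sums at t and t + 1."""
    return truncated_log_mean(c[:, t], order) - truncated_log_mean(c[:, t + 1], order)


def jackknife_mass(c, t, order):
    """Delete-one jackknife of the truncated effective mass.

    c has shape (n_configs, n_t) and must be positive at t and t + 1.
    """
    c = np.asarray(c, dtype=float)
    n_cfg = c.shape[0]
    full = truncated_effective_mass(c, t, order)
    reps = np.array([
        truncated_effective_mass(np.delete(c, i, axis=0), t, order)
        for i in range(n_cfg)
    ])
    err = np.sqrt((n_cfg - 1) / n_cfg * np.sum((reps - reps.mean()) ** 2))
    return full, err
```

Repeating the call for several values of `order` gives a rough handle on the truncation systematic, since the spread between orders is not captured by the jackknife error.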

An immediate problem in doing this is that correlators at different time steps are themselves
strongly correlated. Usual fits take this correlation into account in the construction of the
correlation matrix for the chi-squared function. Information about time autocorrelations is
lost when the cumulant sum is performed time step by time step. If one is doing an effective
mass fit with Eq. [11], these correlations do not affect the mean value of M because the fit
has no degrees of freedom. They do affect the uncertainties on the mass and intercept, but
presumably a jackknife can handle that. However, serious fits to lattice data are typically
"range fits" over many values of t. Then correlations are important. The author has seen
many fits which miss the central values of the individual C(t)'s in an asymmetric way, due
to the off-diagonal correlations in the data.

FIG. 8: (a) Cumulants and (b) mass differences for the SU(5) J = 5/2 baryon. In panel (a), $\kappa_2$,
$\kappa_3$ and $\kappa_4$ are shown by octagons, squares, and diamonds. In panel (b), mass differences at t = 4,
6, 8, and 10 are shown as octagons, squares, diamonds and fancy diamonds. The line is a fit to S.

For my quenched and dynamical data sets of meson and baryon correlators, I compared
conventional effective mass and range fits to fits where C(t) was replaced by a truncated
cumulant sum. Even a truncation ending with $\kappa_2$ produced fit masses consistent with the
usual fits. However, unlike what Refs. [1-3] found, my uncertainties are not improved using
the truncated cumulant sum. I show results from a quenched SU(3) example in Fig. [13].
Since the uncertainties are what I want to show, I offset the various orders of the truncated
sum by constants. My observations are of course not a blanket statement that the truncated
cumulant sum cannot be used to improve fits, only that I could not do it.

Second, one could take Eq. [9] seriously: the change in the width of the second cumulant
measures a mass difference between an n-hadron state in finite volume and n times the
single-hadron state. (This connection was first made in the context of unitary Fermi gases
by Nicholson [14].) In QCD, the mass difference gives information on a scattering length a:

$$\Delta M_n = \frac{4 \pi a}{M L^3} \binom{n}{2} \tag{12}$$

for particles in a box of volume $V = L^3$. Does the data support this? We can test this hypothesis
by taking the values of S from different volumes and just overlaying $L^3 S$. Examples were
already shown, in Figs. [11] and [12]. Sometimes the volume dependence is there.

Curious as it is, this connection cannot be made more precise in QCD. The n-hadron
correlation functions from which my masses are extracted (the moments) are composed of
n distinct color traces. These correlation functions do not project onto a unique isospin,
and so the right-hand side of Eq. [12] is - at best - a weighted sum of scattering lengths in
different isospin channels. And in QCD, three-body interactions have been measured by at
least one group [15]. They are not zero.

FIG. 9: Effective mass for moments of potentials in quenched SU(3), $\beta$ = 6.0175. Higher moments
lie higher. (a) r = 2 planar loop; (b) r = (1, 1, 1) loop; (c) r = $3\sqrt{2}$ loop; (d) r = $6\sqrt{2}$ loop.

Since I do not have any crisp conclusions, I will finish with some questions. First, is 
this behavior really ubiquitous? It would be very interesting to look at distributions of 
correlation functions for two cases for which I don't have data: One is simulation data from 
large lattices and small quark masses, where the lightest states in a generic channel are not 
single-particle states, but multi-body ones. The rho channel when $m_\rho \gg 2 m_\pi$ is an example.
Another example would be correlation functions of operators which are highly tuned (say 
from a variational calculation) to project on a single state. I am also not satisfied with my 
comparisons of different volumes. 

Second, if log-normality is there, why is it there? Since I see it in so many channels, it can't
be a consequence of the kind of correlator (baryon versus meson) or system (confining versus
conformal). Log-normal distributions commonly arise when an observable is a product of a
set of independent positive random numbers. The variables in a lattice simulation
of QCD are matrices, not numbers, and they are not completely random either - the action
weights the likelihood of a configuration. One thing that all the correlators I have examined
have in common is that they involve products of link matrices, and the number of link
matrices involved in a correlator increases with its t value. This is certainly the case for
Wilson loops. Hadronic correlators are built of quark propagators. Quark propagators are
themselves sums of products of link variables connecting the source and the sink points of
the propagator. Because of the additive property of cumulants, the cumulants of the log
of the product of r independent, identically distributed random variables are equal to r times
the cumulants of the log of a single variable. This certainly has the flavor of the linear
increase in $\kappa_2(t)$ observed in the data.

FIG. 10: Effective mass differences $\Delta M_n = n M_1 - M_n$ from Wilson loops in quenched SU(3),
$\beta$ = 6.0175. Symbols are octagons for t = 4, squares for t = 6, diamonds for t = 8 and fancy
diamonds for t = 10. (a) r = 2 planar loop; (b) r = (1, 1, 1) loop; (c) r = $3\sqrt{2}$ loop; (d) r = $6\sqrt{2}$
loop.

FIG. 11: Parameter S from the slope of $\kappa_2$ with t, from Eq. [7], for pseudoscalar correlators from
quenched simulations. Squares, $16^3$ volume; octagons, $12^3$ volume. (a) SU(3) (b) SU(5) (c) SU(7).
The x axis is the AWI quark mass and all three data sets are matched in lattice spacing.
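The product-of-random-numbers picture is easy to illustrate with a toy model. The sketch below is not a lattice calculation: it assumes independent factors drawn uniformly from (0.5, 1.5), an arbitrary choice made only for illustration, and measures the second cumulant of the log of a product of r such factors.

```python
import numpy as np


def kappa2_vs_length(n_samples=20000, r_max=8, seed=0):
    """Variance of log(product of the first r factors), for r = 1..r_max."""
    rng = np.random.default_rng(seed)
    # Toy stand-in for a product of positive factors along a path of length r.
    factors = rng.uniform(0.5, 1.5, size=(n_samples, r_max))
    log_prod = np.cumsum(np.log(factors), axis=1)
    return log_prod.var(axis=0)


k2 = kappa2_vs_length()
```

In this toy model `k2[r - 1]` grows linearly with r, up to sampling noise, mirroring the linear growth of $\kappa_2$ with t. Real link matrices are neither independent nor commuting numbers, so this is at best a heuristic.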

And finally, can log-normality be used to do anything quantitative? So far, I have not 
been able to use it to improve mass determinations along the lines of [1-3].

As I said at the start, I am not sure whether approximate log-normality in lattice
correlator data is useful for anything. However, I have to say: I have been looking at lattice data
for a long time, and it is quite curious to observe something new and (apparently) generic
in it.

FIG. 12: Parameter S from the slope of $\kappa_2$ with t, from Eq. [7], scaled by the spatial simulation
volume, for the highest-spin baryon from quenched simulations. Squares, $16^3$ volume; octagons,
$12^3$ volume. (a) SU(3) (b) SU(5) (c) SU(7). The x axis is the AWI quark mass and all three data
sets are matched in lattice spacing.

show the conventional fit result. Octagons, squares and diamonds truncate the sum at n = 2, 3,